Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MNT: add EntryVisitor and refactor search methods to use SearchVisitor #110

Draft
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

shilorigins
Copy link
Collaborator

@shilorigins shilorigins commented Nov 26, 2024

WIP

  • SearchVisitor is not a true generator, it just yields from a list to satisfy the existing generator handling
    • I'm working on seeing if I can make SearchVisitor a generator without requiring all EntryVisitors to be generators
    • Otherwise, I can adjust the .visit and .accept calls to support generators for visitors that want to use them

Description

Implement EntryVisitor abstract class, with corresponding Entry.accept methods
Implement SearchVisitor as the first concrete visitor
Add ancestor search option to SearchVisitor

Relates to #106

Motivation and Context

Using a visitor pattern will let us decouple Entry DAG traversal from specific components, such as the Client or concrete _Backend subclasses. This should make it easier to do things like collect and search entries, fill UUIDs, and save snapshots without having to re-implement traversal logic in multiple places.

How Has This Been Tested?

Existing search tests cover the new SearchVisitor.

Where Has This Been Documented?

Pre-merge checklist

  • Code works interactively
  • Code follows the style guide
  • Code contains descriptive docstrings, including context and API
  • New/changed functions and methods are covered in the test suite where possible
  • Test suite passes locally
  • Test suite passes on GitHub Actions
  • Ran docs/pre-release-notes.sh and created a pre-release documentation page
  • Pre-release docs include context, functional descriptions, and contributors as appropriate

@shilorigins shilorigins self-assigned this Nov 26, 2024
@shilorigins shilorigins changed the title Devagr/entry visitor MNT: add EntryVisitor and refactor search methods to use SearchVisitor Nov 26, 2024
Copy link
Contributor

@tangkong tangkong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the presence and differentiation between .visit() and .visitEntryType() methods is throwing me off here, left a few rather confused comments. Perhaps consolidating everything with single-dispatch will make things simpler / smoother

entry.accept(self)
return

def visitParameter(self, parameter: Parameter) -> None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can possibly use functools.singledispatch here to consolidate these methods. (as I familiarize myself more with this pattern I realize that's exactly what we should do)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually confused by the presence of these two types of visit method, really all .visit() is doing here is calling Entry.accept(), which ultimately calls a .visit*() method? (In a sort of triple dispatch back and forth). Maybe all the .visit()'s are meant to be interchangeable?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like I made up .visit without realizing. There's a little more discussion in the below thread.

yield entry
visitor = SearchVisitor(self, *search_terms)
root = self.root
visitor.visit(root)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing this elsewhere as starting with an "accept" call, such as root.acept(visitor). These are functionally equivalent right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, looks like I made up .visit without realizing. It does give us a benefit though: we can use .visit for tracking state during traversal. For example, SearchVisitor adds the Entry to the path and pops it off the path in .visit because .visit doesn't return until all of the reachable entries have been traversed, whereas .visit* return before the reachable entries are traversed. Without .visit, we might have to move the traversal logic into .visit*, so we can track the state we need.

I'll try out removing .visit and seeing what changes need to be made.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference, a sequence diagram and class diagram from the source of all my knowledge: https://en.wikipedia.org/wiki/Visitor_pattern#Structure

self.path = []
self.matches = []

def _check_match(self, entry: Union[Entry, Root]) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def _check_match(self, entry: Union[Entry, Root]) -> bool:
def _check_match(self, entry: Union[Entry, Root]) -> None:

Because it confused me as I was reading this.

This would be the place to expose any entry that matches, and I think it might be worth bubbling that up and making visit yield as well. I can imagine other visitors gathering PVs or specific entries etc, but not every node we visit will yield a result (Entries that don't match). So in the end maybe only EntryVisitor.visit returns a generator and whatever other worker methods return values as necessary?

In untested skeleton code I'm imagining here (consolidating all the visit_ methods):

    def visit(self, entry: Union[Entry, Root, UUID]) -> None:
        if isinstance(entry, UUID):
            entry = self.backend.get_entry(entry)
        self.path.append(entry)

        # Perform work
        result = self._check_match(entry)
        if result is not None:
            yield result

        # visiting entry's children handled by Entry.accept() calls, previously in super().visit(entry)
        # but we get to visit via an `Entry.accept()` call, so why do we call accept again here?

        self.path.pop()

Excuse my general confusion here

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is basically how I was thinking to use a generator. Per the above thread, if we remove .visit then the generator might become .accept, but the idea is basically the same.

RE consolidating .visit*: In SearchVisitor all of the .visit* methods are identical because we searches treat the different Entry subclasses the same: we check if they match the search terms. Other visitors might treat the subclasses differently, like SnapVisitor only saving data for Parameters. So in general we can't consolidate the .visit* methods, but in some specific visitors they'll be identical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that the visitors want to differentiate between Entry subclasses, I was arguing more for doing that via single dispatch and overloading the .visit() method.

In that case we won't have to make the Entry.accept methods pick the correct .visit method, the Visitor will get the correct one at runtime

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants